ABSTRACT

We present Easylife, the software environment developed within the framework of the VIPERS 
project for automatic data reduction and survey handling. Easylife is a comprehensive system 
to automatically reduce spectroscopic data, to monitor the survey advancement at all stages, to 
distribute data within the collaboration and to release data to the whole community. It is based 
on the OPTICON-funded project FASE, and inherits the FASE capabilities of modularity and
scalability. After describing the software architecture, the main reduction and quality control 
features and the main services made available, we show its performance in terms of reliability of 
results. We also show how it can be ported to other projects having different characteristics. 
1. Introduction

Thanks to the continuous evolution of astronomical instrumentation and in particular of the
multiplexing gain of faint-object spectrographs, large-scale spectroscopic surveys have become
a real industry, in which up to 10^6 spectra can be accumulated by a single project. So far,
this has been particularly true for redshift surveys of the "local" Universe (z ~ 0.1), with
the notable milestones represented by the 2dF Galaxy Redshift Survey (2dFGRS, Colless et al.
2001) and the Sloan Digital Sky Survey (SDSS, Eisenstein et al. 2001; Abazajian et al. 2009),
which built on earlier pioneering projects of the 1980s and 1990s such as the CfA redshift
survey (Davis et al. 1982; Geller & Huchra 1989), Perseus-Pisces (Giovanelli et al. 1986),
ESP (Vettolani et al. 1997) and LCRS (Shectman et al. 1996).

For obvious reasons, redshift surveys of the more distant Universe (z ~ 1) were limited to
smaller numbers, with samples of a few hundred to a few thousand galaxies (e.g. CFRS,
Le Fevre et al. 1995), which more recently grew to a few tens of thousands of objects with the
advent of new multi-object spectrographs on 8-m class telescopes, like VIMOS and DEIMOS (e.g.
VVDS, Le Fevre et al. 2005; Garilli et al. 2008; DEEP2, Coil et al. 2004; zCosmos, Lilly et al.
2007). Lately, a further increase in the size of samples at intermediate redshift (z ~ 0.5)
has been possible by targeting specific classes of galaxies, like star-forming objects in the
case of the WiggleZ survey (Drinkwater et al. 2010) or massive "reddish" galaxies in the case
of SDSS3-BOSS (Schlegel et al. 2007). In particular, the total yield for this latter project
will be of the order of 10^6 spectra. This trend is expected to continue with future redshift
surveys, as is the case for the tens of millions of redshifts expected for the ESA mission
Euclid (Laureijs et al. 2011), or Gaia (Kontizas et al. 2011; Karampelas et al. 2012).

Potentially, the amount of information provided by such large-scale surveys is enormous, but
to exploit its full scientific potential, measurements have to be extracted from the raw data
in a way which is both efficient (thus with minimum human intervention) and at the same time
reliable; equally important, they have to be distributed to the community in an easily
manageable form. With these goals in mind, a number of projects have developed automatic
pipelines tuned for their own needs.

Since its planning in the early 1990s, the SDSS dedicated a big effort to the creation of a
full pipeline for the data reduction (e.g. Lupton et al. 2002) and a parallel database system
to handle the enormous (for that time) amount of photometric and spectroscopic data (e.g.
Szalay et al. 2002). Similar efforts were later implemented in particular by large photometric
surveys, with the creation of data processing centres, like Terapix for the CFHT observations
(Bertin et al. 2002), the UKIDSS center (Warren et al. 2007) or the CANDELS pipeline (Grogin
et al. 2011). Among pure spectroscopic surveys, VVDS (Scodeggio et al. 2005), zCosmos (Lilly
et al. 2007), AGES (Auld et al. 2006) and more recently WiggleZ (Drinkwater et al. 2010) have
all built their own tools, eventually gluing together pre-existing algorithms and programs
into an automatic processing chain.

Data dissemination is the second important requirement these projects have to face. This
includes both internal distribution to the survey team, and public release to the scientific
community; the latter may also include public outreach products. The Virtual Observatory has
set up standards and conventional formats for this purpose (see
http://www.ivoa.net/Documents/). VO compatible tools have been flourishing over recent years
(Aladin, Topcat, VOSpec; for a more complete list see
http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/IvoaApplications) and are on the way to
becoming the standard for data dissemination. Currently, however, each survey still tends to
provide its own specific web pages, from where data and information can be downloaded, either
through plain ASCII files or via more sophisticated database systems: SDSS
(http://www.sdss.org/dr7/), DEEP (http://deep.berkeley.edu/DR3/), VVDS
(http://cesam.oamp.fr/vvdsproject/) and Cosmos (http://cosmos.astro.caltech.edu/data) are just
some examples.

A third important point in exploiting such large and long-lasting projects is book-keeping of
the survey processes. This is usually kept by the project coordinator or by a restricted
coordination group, not always using appropriate tools, with considerable expenditure of time.

When we started the VIMOS Public Extragalactic Redshift Survey (VIPERS) in 2008, we decided to
invest time and manpower in a survey management system capable of automatically taking care
of: data reduction and redshift measurement, quality control, data dissemination (both
internal and to the public) and logging. In this paper we describe the system we have set up,
called Easylife. In sections 2, 3 and 4 we briefly describe the VIPERS survey
(http://vipers.inaf.it), the VIMOS spectrograph and the observing sequence to be followed
within ESO projects. In section 5 we detail the requirements we have defined. The system
architecture is briefly outlined in section 6. After a description of the main tools
(section 7), we dedicate section 8 to the performance in terms of reduction quality we obtain
with Easylife. In section 9 we show how we are using Easylife for other projects.

2. The VIPERS survey 

VIPERS is an ongoing ESO Large Programme aimed at measuring redshifts for ~10^5 galaxies at
redshift 0.5 < z < 1.2, to accurately and robustly measure clustering, the growth of structure
(through redshift-space distortions) and galaxy properties at an epoch when the Universe was
about half its current age. The galaxy sample is selected from the Canada-France-Hawaii
Telescope Legacy Survey Wide (CFHTLS-Wide) optical photometric catalogues (Goranova et al.),
within the W1 and W4 CFHTLS fields. Galaxies are selected to a limit of i_AB < 22.5, further
applying a simple and robust gri colour pre-selection, so as to effectively remove galaxies at
z < 0.5. Coupled to an aggressive observing strategy (Scodeggio et al. 2009), this allows us
to double the galaxy sampling rate in the redshift range of interest, with respect to a pure
magnitude-limited sample, reaching a target sampling rate of ~40%. At the same time, the area
and depth of the survey result in a fairly large volume, 5 x 10^7 h^-3 Mpc^3, analogous to
that of the 2dFGRS at z ~ 0.1. Such a combination of sampling and depth is quite unique among
current redshift surveys at z > 0.5. VIPERS spectra are collected with the VIMOS multi-object
spectrograph (Le Fevre et al. 2000) at moderate resolution (R = 210), using the LR Red grism,
providing a wavelength coverage of 5500-9500 Å and a typical radial velocity error of
175(1 + z) km s^-1. The full VIPERS area of ~24 deg^2 is covered through a mosaic of 288 VIMOS
pointings (192 in the W1 area, and 96 in the W4 area).

As of January 2012, about 60% of the VIPERS area has been observed, with completion expected
by 2014. A first discussion of the spectral data together with Principal Component
classification can be found in Marchetti et al. (2012). More details will be available in
Guzzo et al. (2012, in preparation).

3. The VIMOS spectrograph 

VIMOS (Visible MultiObject Spectrograph) is an imaging spectrograph installed on Unit 3
(Melipal) of the ESO Very Large Telescope (VLT) at the Paranal Observatory in Chile (see
Le Fevre et al. (2000) and Le Fevre et al. (2002) for a detailed description of the instrument
and its capabilities). The driving design concept for the instrument is to cover as much of
the unvignetted part of the focal plane as possible at the VLT Nasmyth focus (a circular area
with a diameter of 22 arcmin on the sky). Since this large area corresponds to a very large
linear scale (almost 1 m), it was decided that coverage would be achieved by splitting the
instrument into four identical optical channels arranged next to each other and supported by
the same mechanical structure. Each optical channel is a classical focal reducer imaging
spectrograph, with a collimator providing a parallel beam where the dispersive element (a
grism) is inserted, and a camera that focuses the beam onto a 2048x4096, 15 µm pixel EEV CCD.
The focal plane is flattened by a field lens at the instrument entrance, to allow for flat
multislit masks, and a folding mirror is inserted into the collimator, to fold the beam and to
reduce the instrument's overall length. The field of view covered by each channel (generally
referred to as a VIMOS quadrant) is approximately 7'x8', with a pixel scale of 0.205
arcsec/pixel. An approximately 2 arcmin wide gap is present between quadrants.

For spectroscopic observations, six different grisms provide spectral resolutions ranging from
R ~ 250 to 2500. Order-sorting filters are used to avoid an overlap between first and second
grating orders. Laser-cut masks (one per quadrant) are used for MOS observations. The number
of slits that can be placed on each mask varies from approximately 40 at high spectral
resolution up to approximately 250 at low spectral resolution. An imaging exposure acquired
with VIMOS is required as the starting point of the mask design and cutting process (Bottini
et al. 2005).



4. VIMOS operations within the VIPERS context

Preparing and submitting MOS observations with VIMOS requires a sequence of operations, as
thoroughly explained in the VIMOS User's manuals and ESO web pages. In service mode (which is
the default observing mode), once the pointing location has been chosen, the information
needed to carry out pre-imaging has to be sent to ESO, together with the finding chart of the
field. As soon as pre-imaging data are available, the user is asked to prepare the files
needed to manufacture the masks for the spectroscopic observations, and send them together
with the other information required (instrument configuration, exposure time, observing
sequence, etc.). Mask preparation is done via the VMMPS software (Bottini et al. 2005)
distributed by ESO. Once the spectroscopic observations have been performed, the data can be
retrieved from the ESO archive, and reduced. Finally, from the flux and wavelength calibrated
monodimensional spectra, redshift and other spectral quantities can be measured.

For normal programs, none of these operations is particularly time consuming or demanding. It
is when this sequence is to be applied to a survey which foresees of the order of hundreds of
pointings and a hundred thousand spectra (as VIPERS) that the need for automatization arises.
Easylife is the system we have devised especially for VIPERS, but it can be easily adapted to
other projects requiring a high degree of data reduction automatization. The whole reduction
procedure is based on the pipeline described in Scodeggio et al. (2005), which we have
automatized to a very high degree as explained in section 7.2.3. The redshift measurement is
carried out using EZ (Easy redshift, Garilli et al. 2010) in blind automatic mode. Even if EZ
is rather efficient, especially on this kind of data (see section 7.3), a human inspection of
the spectra is required to validate the measurements and possibly recover a redshift for the
faintest objects. This operation is performed by either one or two persons (according to data
quality). Finally redshifts, together with the redshift reliability flag, as well as mono and
two dimensional spectra have to be fed back to the database for dissemination among the whole
survey team.

5. Software requirements 



VIPERS foresees to observe about 300 VIMOS pointings in four years: each pointing observation
is split in 5 exposures, and each exposure covers the 4 VIMOS quadrants. The expected data
flow is thus of the order of 6000 raw data frames to be reduced and 100000 spectra to be
measured. For such a survey, a semi-manual procedure as adopted for VVDS (Scodeggio et al.
2005) and zCosmos (Lilly et al. 2007) is not efficient enough for the data reduction: VVDS was
made up of 99 VIMOS pointings, and zCosmos of 90 VIMOS pointings. Data reduction and redshift
measurement for both surveys were carried out in manual mode, using VIPGI and EZ, and took of
the order of 5 years to be completed. Scaling to the VIPERS case, this would translate into 15
years of effort just to reduce the data. Therefore, the automatization of the processing
chain, including reduction and automatic redshift measurement, is the first requirement we had
to meet. Such an automatic pipeline must run in unsupervised mode, but has to have built-in
quality checks on crucial steps so that the output products are fully controlled.
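
The figures quoted above can be checked with a few lines of arithmetic. The sketch below is
only an illustration: it assumes that the manual effort scales linearly with the number of
pointings, which is the simplest reading of the scaling argument, and the variable names are
ours.

```python
# Figures quoted in the text.
pointings = 300               # VIMOS pointings foreseen for VIPERS
exposures_per_pointing = 5    # each pointing is split in 5 exposures
quadrants = 4                 # each exposure covers the 4 VIMOS quadrants

raw_frames = pointings * exposures_per_pointing * quadrants

# Manual-mode reference: VVDS (99 pointings) took of the order of 5 years.
vvds_pointings = 99
vvds_years = 5.0

# Assumption: effort scales linearly with the number of pointings.
vipers_years_manual = vvds_years * pointings / vvds_pointings

print(raw_frames)                   # 6000
print(round(vipers_years_manual))   # 15
```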

Periodic internal data releases to the VIPERS 
consortium must be foreseen, to allow for scien- 
tific exploitation even before the whole survey has 
been completed, as well as periodic public releases 
to the whole community. This can be easily ac- 
complished without additional workload if, after 
reduction and redshift measurement, all informa- 
tion is automatically entered in a database, which 
can be opened, in full or in part, when data must 
be released. 



Tasks like finding chart production and mask preparation, which cannot be automatized further
with respect to the tools ESO provides, are carried out by different people, and the same
applies to the redshift measurement task. Distributed work can be more efficient if handy
tools are used to get the required input (e.g. pre-imaging for preparing masks, reduced
monodimensional spectra for measuring redshifts) and send back results. Web-driven upload and
download procedures which take care of storing results and performing quality checks have to
be provided.

Finally, adequate book-keeping must be provided for several aspects: the managerial need of
evenly distributing the workload among all partners, and of tracking the degree of advancement
of each person and each task; information on the targets selected for the observations has to
be kept; since the program is spread over a few years, it is advisable to keep track of when
observations (both pre-imaging and spectroscopy) have been taken; and last, but not least, all
consortium members must be able to check the advancement in terms of observations, data
reduction, completeness, etc.

Automatic reduction, database storage and book- 
keeping are the basic requirements we have set for 
Easylife, together with keeping to a minimum the 
need for supporting man power from the 'survey 
reduction center'. On top of these requirements, it 
was desirable to use reduction tools already fully 
tuned and tested, instead of rewriting all the re- 
quired routines from scratch. Finally, we wanted 
to create a flexible system which could be adapted 
to similar projects in which we are involved. 



6. Software architecture 

The three main requirements described in the previous section naturally lead to a modular
system, where reduction programs (usually written in C), databases (mySQL based) and web
interfaces (developed in Java or HTML) can live together and interact seamlessly. The OPTICON
Future Astronomical Software Environment (from here on, FASE, Grosbol et al. 2005) is a
scalable open system application framework with distributed capabilities, specifically
designed for astronomical software, which can by design satisfy all these needs. The FASE
architecture is described in Paioro et al. and here we recall the fundamental concepts.
Following Figure 1, the major system elements are as follows:

• Presentation layer: the part of the system which presents the various functionalities to
the user. The user can be a human, but also a Grid workflow, a Web browser interface, or any
other client. The Presentation layer itself can be a Command Line Interface, a Graphical User
Interface or a Web interface.

• Application layer: used to implement top level applications. It can be anything which can
drive the execution framework to execute components, for example Python, Java, a GUI, or a
work-flow engine of some sort.

• Execution framework: provides the func- 
tionality needed to execute components, in- 
cluding capabilities such as component regis- 
tration and management, distributed execu- 
tion, scalability, messaging, logging, and so 
forth. Different execution frameworks, each 
one having different capabilities, can be im- 
plemented. 

• Container: components execute within a 
container which defines the life cycle and 
runtime environment seen by the compo- 
nent. The container is the interface between 
the execution framework and an individual 
component. 

• Components: a component is a computa- 
tional object, with one or more service meth- 
ods, which can be plugged into the frame- 
work. Components are grouped into compo- 
nent packages and provide most of the func- 
tionality of the system. 
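
As an illustration of how the elements listed above relate to each other, the following
minimal Python sketch mimics the component, container and execution framework roles. It is
not the FASE API: all class, method and component names (including the pointing name W1P001)
are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List


class Component:
    """A computational object exposing one or more service methods."""

    def __init__(self, name: str, services: Dict[str, Callable[..., object]]):
        self.name = name
        self.services = services


class Container:
    """Wraps a single component and controls its life cycle."""

    def __init__(self, component: Component):
        self.component = component
        self.running = False

    def call(self, service: str, *args):
        self.running = True
        try:
            return self.component.services[service](*args)
        finally:
            # The component is considered stopped once the call returns.
            self.running = False


class ExecutionFramework:
    """Registers components and executes their services, logging each call."""

    def __init__(self):
        self.containers: Dict[str, Container] = {}
        self.log: List[str] = []

    def register(self, component: Component) -> None:
        self.containers[component.name] = Container(component)

    def execute(self, name: str, service: str, *args):
        self.log.append(f"{name}.{service}")
        return self.containers[name].call(service, *args)


# Application layer: a plain script driving the framework.
framework = ExecutionFramework()
framework.register(Component("unpacker", {"unpack": lambda path: f"unpacked {path}"}))
framework.register(Component("reducer", {"reduce": lambda pointing: f"reduced {pointing}"}))
result = framework.execute("reducer", "reduce", "W1P001")  # hypothetical pointing name
```

In this toy version a pre-existing program is plugged in simply by wrapping it in a Component
and registering it, which is the kind of reuse the modular design is meant to allow.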

The component-framework architecture outlined here is a modular architecture in which the
major elements of the system can be used separately, as stand alone packages, or can be
integrated into other frameworks. The advantage of a modular architecture is that the major
elements of the system can evolve independently, making it easier to use new technology as it
becomes available. In developing Easylife, we have made full use of the modularity FASE
provides: some elements (the Reducer and to a certain extent the Unpacker) were pre-existing
and we have just plugged them into the global system after having built the appropriate
container. At the same time we have been able to extend the system capabilities by adding
extra components for the data reduction of other instruments (like LUCI and MODS at the LBT,
see section 9).

7. Easylife building blocks 

Following the architectural concept of FASE, the different tasks deriving from the
requirements outlined in section 5 are handled through a dedicated Easylife component and/or
GUI. The survey status can be monitored and managed through an administrative web site, which
is also used to provide the public Web pages of the VIPERS survey. Data ingestion,
organization and reduction tasks, together with automatic redshift measurement, are carried
out by dedicated tools running on a Beowulf cluster. Finally, the results database is based on
mySQL and accessed through a dedicated GUI. All these parts communicate with an administrative
SQL-based database, which keeps track of the global status of the survey, and all together
constitute the Easylife system.

7.1. VIPERS administrative web site 

The administrative web site is the uppermost 
presentation layer of Easylife. It allows one 
to monitor the survey status, access all survey- 
related side products, as outlined below, and re- 
trieve data. 

The VIPERS administrative web site is built on top of a Web application framework running on a
Jakarta Apache server. It allows one to serve normal static HTML pages as well as dynamic
pages. VIPERS pages are built upon a template system integrated within the Web application
framework, which ensures homogeneity of the layout. The Web application framework is fully
integrated within the Easylife management system, and directly accesses the underlying SQL
database, which contains all the relevant information for the survey monitoring. It is
structured to have different access levels: a public part (http://vipers.inaf.it), which
describes the survey goals, shows the team composition and will contain a summary of the most
relevant results; a team restricted part; and an administrative part, with access restricted
to the PI and the administration team. Through the private part, each member of the team can
retrieve the information he/she may need, e.g.

• Inspect how the survey is advancing. An example is given in Figure 3 for the CFHTLS-W1
area. The different colors indicate the different advancement status of each pointing (green
for observed, yellow for reduced, red for fully measured, etc.). For each pointing, relevant
information such as date of observation, meteorological conditions during observations
(through a link to observation logs provided by ESO) and data quality is accessible by
clicking on the pointing itself (see the related figure). These figures are created on the fly
from the SQL database holding all survey information, and thus are always automatically up to
date.

• Connect to the database system, providing the photometric parent catalogs and the catalogs
with the scientific information extracted from the spectra. The database system is based on
the DART software (Paioro et al. 2008), a Web interface which allows one to query catalogs and
access their associated data products (see section 7.4).

• Have access to project documentation and meeting minutes, as well as to the VIPERS science
wiki pages related to different internal projects or working groups.

• Upload any VIPERS related publication, and look at publications or presentations given by
team members.

• Retrieve the data for mask preparation or redshift measurement and upload the results.
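
The idea that the status maps are always up to date because they are derived from the database
at request time can be sketched as follows. This is only an illustration under stated
assumptions: an in-memory SQLite database stands in for the survey SQL database, and the table
layout, pointing names and the grey fallback colour are hypothetical; only the
green/yellow/red coding comes from the text.

```python
import sqlite3

# Colour code quoted in the text; further states ("etc.") are not listed there.
STATUS_COLOURS = {"observed": "green", "reduced": "yellow", "measured": "red"}

conn = sqlite3.connect(":memory:")  # stand-in for the survey SQL database
conn.execute("CREATE TABLE pointings (name TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO pointings VALUES (?, ?)",
    [("W1P001", "measured"), ("W1P002", "reduced"),
     ("W1P003", "observed"), ("W1P004", "planned")],
)


def pointing_colours(connection):
    """Return {pointing: colour}, computed from the current database content."""
    rows = connection.execute("SELECT name, status FROM pointings ORDER BY name")
    # States without a colour in the text fall back to grey (an assumption).
    return {name: STATUS_COLOURS.get(status, "grey") for name, status in rows}


colours = pointing_colours(conn)

# A later status change is picked up by the next call without any extra step.
conn.execute("UPDATE pointings SET status = 'reduced' WHERE name = 'W1P003'")
colours_after = pointing_colours(conn)
```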

The administrative pages are reserved to the PI or project administrator to

• Assign the VIMOS mask preparation to the team members.

• Once data have been reduced, assign the redshift measurement validation to the different
team members.

• Make new data releases, freezing the current status of the spectroscopic catalogs and
labeling them with a custom tag. Some statistics are then produced summarizing the survey
status and outcome.

• Keep track of the "service" work done by each team member, to avoid overloading some with
respect to others.

7.2. Data ingestion and reduction 

While the administrative web site allows one to handle the global phases of the survey
process, data management and reduction are performed by a restricted data reduction group
through a dedicated Graphical User Interface. This GUI handles three main software elements,
each of which is dedicated to a specific set of reduction and management tasks:

• Unpacker (section 7.2.1): unpacks the raw data and prepares them for ingestion in the
reduction system;

• Organizer (section 7.2.2): fills the database containing the pointing information and
organizes the data in a pre-defined structure, classifying each file by its attributes;

• Reducer (section 7.2.3): reduces the raw data in order to produce mono-dimensional
wavelength and flux calibrated spectra for each target object and measures the spectroscopic
redshifts.

The Unpacker, Organizer and Reducer are used through the GUI in a seamless way: the user
chooses (a) the project to be handled; (b) the raw data to be unpacked; (c) the pointings to
be reduced and the related files to be used for the reduction; and then (d) launches the
reduction process.

7.2.1. Data preparation 

The Unpacker is the Easylife software element dedicated to ingesting raw data into the
reduction system. Easylife has been conceived with the aim of being usable for several
projects and several spectrographs. The purpose of the Unpacker is to analyse the raw data it
receives, discard whatever is not needed or wanted, and add to the header of the raw data
files some conventional keywords, which will allow the data to be classified according to the
project/instrument they belong to. The exact behaviour of the Unpacker is driven by
configuration settings, which essentially indicate where, and in which form, the information
required is contained in the raw data. At the end of the process, each raw data file contains
standard hierarchical FITS keywords which hold the main information required for classifying
the file independently of the instrument: for example, the disperser used, the target name,
the instrument name, the airmass, and others. The file is also renamed following a "human
readable" syntax which allows one to immediately identify whether it is a scientific exposure
or a flat field, which is the target and with which disperser it has been observed.

Easylife hierarchical FITS keywords provide a 
conventional set of information irrespective of the 
instrument which has produced the data. This 
information is what the Organizer needs to classify 
the data. 
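
A minimal sketch of this configuration-driven step is given below. It is not the Easylife
code: headers are represented as plain dictionaries rather than FITS files, and the
configuration layout, the EASYLIFE keyword prefix, the native keyword names, the file naming
pattern and the example values are all assumptions made for illustration.

```python
# Hypothetical per-instrument configuration: for each standard piece of
# information, the native header keyword that holds it (illustrative names).
UNPACKER_CONFIG = {
    "VIMOS": {
        "INSTRUMENT": "INSTRUME",
        "TARGET": "ESO OBS TARG NAME",
        "DISPERSER": "ESO INS GRIS1 NAME",
        "AIRMASS": "ESO TEL AIRM START",
    },
}


def standardise(header, instrument, prefix="EASYLIFE"):
    """Return a copy of the header with instrument-independent keywords added."""
    mapping = UNPACKER_CONFIG[instrument]
    out = dict(header)
    for standard_key, native_key in mapping.items():
        out[f"HIERARCH {prefix} {standard_key}"] = header[native_key]
    return out


def readable_name(header, kind, prefix="EASYLIFE"):
    """Build a human readable file name from the standard keywords only."""
    def get(key):
        return str(header[f"HIERARCH {prefix} {key}"]).replace(" ", "_")
    return f"{kind}_{get('TARGET')}_{get('DISPERSER')}.fits"


# Example raw header (values are made up).
raw = {
    "INSTRUME": "VIMOS",
    "ESO OBS TARG NAME": "W1P001",
    "ESO INS GRIS1 NAME": "LR_red",
    "ESO TEL AIRM START": 1.12,
}
std = standardise(raw, "VIMOS")
name = readable_name(std, "science")
```

Supporting another spectrograph in this sketch amounts to adding one more entry to the
configuration dictionary, while the code that consumes the standard keywords is unchanged.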

It is worth noting that this approach to data ingestion allows one to use Easylife for
different projects and even instruments; for each target application (be it a survey with
VIMOS, or several observations with another spectrograph), it is sufficient to configure a
different Unpacker to customize Easylife for projects very different from the VIPERS survey it
has been devised for. In section 9 we will show how we have already used Easylife for other
projects.

7.2.2. Data organization 

Once the data contain a set of standard information in a standard format, they can be easily
classified and organized according to their content. The classification is stored in a mySQL
database (which is also accessed by the web interface, see section 7.3), while the data
management operations are performed through the Organizer, which provides the data
organization and administration functions. The Organizer handles multiple projects (the
VIPERS application is one project), providing a separate work space for each one. A project
work space consists of: 1) a data storage area, which points to a well defined directory
structure; 2) a set of database tables: the table holding the file attributes and their
reduction status, the table collecting the administrative information concerning the pointings
(or targets) and their global status, and the table containing the information on the
calibration files and their validity range. Every inquiring operation on the files and/or on
the survey management process is performed by the different Easylife components accessing the
Organizer. The Organizer is thus the main element that allows one to orchestrate the entire
management system.
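
The three tables of a project work space can be pictured with the following sketch. It is an
illustration only: SQLite (in memory) stands in for the mySQL server, and every table name,
column name and example row is an assumption rather than the actual Easylife schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the mySQL database
conn.executescript("""
CREATE TABLE files (              -- file attributes and reduction status
    filename   TEXT PRIMARY KEY,
    pointing   TEXT,
    kind       TEXT,              -- e.g. science, flat, arc
    disperser  TEXT,
    obs_date   TEXT,              -- ISO date, so text comparison orders correctly
    reduced    INTEGER DEFAULT 0
);
CREATE TABLE pointings (          -- administrative information and global status
    name    TEXT PRIMARY KEY,
    status  TEXT
);
CREATE TABLE calibrations (       -- calibration files and their validity range
    filename    TEXT PRIMARY KEY,
    kind        TEXT,
    valid_from  TEXT,
    valid_to    TEXT
);
""")
conn.execute("INSERT INTO pointings VALUES ('W1P001', 'observed')")
conn.execute(
    "INSERT INTO files VALUES "
    "('science_W1P001_LR_red.fits', 'W1P001', 'science', 'LR_red', '2010-11-03', 0)"
)
conn.executemany(
    "INSERT INTO calibrations VALUES (?, ?, ?, ?)",
    [("sens_2010a.fits", "sensitivity", "2010-04-01", "2010-09-30"),
     ("sens_2010b.fits", "sensitivity", "2010-10-01", "2011-03-31")],
)


def calibration_for(connection, kind, obs_date):
    """Return the calibration file of the given kind valid on obs_date, if any."""
    row = connection.execute(
        "SELECT filename FROM calibrations "
        "WHERE kind = ? AND valid_from <= ? AND ? <= valid_to",
        (kind, obs_date, obs_date),
    ).fetchone()
    return row[0] if row else None


match = calibration_for(conn, "sensitivity", "2010-11-03")
```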

7.2.3. Data reduction and quality control 

The data reduction is performed with a special Easylife software component (the Reducer),
which provides an automatic pipeline system equipped with a specific plug-in for the VIMOS
instrument. Thanks to the FASE distributed execution engine, the Reducer is able to process
multiple observations at the same time, submitting the reduction processes to a Beowulf
cluster. The reduction steps and underlying recipes are described in Scodeggio et al. (2005),
and recalled in Figure 2. Briefly, the implemented global data reduction scheme is a fairly
traditional one, broadly following the one implemented by the IRAF longslit package: 1)
location of spectral traces on the raw frames, 2) computation of the Inverse Dispersion
Solution for each spectral trace, 3) sky subtraction on the non-calibrated data, 4) two
dimensional extraction of spectra and application of the wavelength calibration, 5)
combination of the sequence of observations, 6) extraction of mono-dimensional spectra and
correction for the instrument sensitivity function (flux calibration). A special effort was
made to achieve a very high efficiency during the repeated application of this scheme to the
large set of VVDS data by tailoring all aspects of the data reduction scheme to the specific
characteristics of VIMOS. Still, the various reduction functions are general enough that they
can be adapted for the reduction of data produced by any MOS spectrograph, with a minimal
effort (see section 9 and Nastasi et al. 2012). Such recipes, in their original form, formally
always end successfully, but this does not automatically mean that the result meets the degree
of accuracy required by the specific scientific need. For example, a spectrum can be
successfully wavelength calibrated, but with a wavelength calibration accuracy of the order of
1 pixel. This is clearly not enough if the redshift accuracy required is much higher than
that. In the past, reduction results were always manually checked and, when required, data
were reduced again in order to improve results. Given the high data flow of VIPERS (6000 raw
frames), the fully automated pipeline must also ensure that the reduction results are
scientifically exploitable. For these reasons, on top of the reduction flow described in
Scodeggio et al. (2005), we added some quality check steps. Every time one of these quality
checks is not satisfied, the reduction process is stopped and human intervention is required.
We have explored the parameter space of each step in order to find the minimum (maximum) value
above (below) which VIPERS data are scientifically usable. Such limits are stored in a
configuration file. The quality checks we perform, together with the adopted limits, are the
following:



1. Check on spectra location. Each VIPERS observation consists of several exposures, possibly
spread over different nights. It is well known that VIMOS suffers from a flexure problem (only
recently fixed thanks to an Active Flexure Compensator, see Hammersley et al. 2010) so that
the location of the dispersed spectra on the different exposures can differ by a few pixels
from the expected positions. For this reason, the task computes the expected spectrum border
position and compares it with the real detected spectrum border. The median of this
displacement is requested not to exceed 1.5 pixels for 5% of the spectra in one VIMOS
quadrant. If these conditions are not satisfied, the expected position is not accurate enough
to guarantee a good spectrum tracing and therefore extraction in all exposures of the same
field, and the procedure is stopped to allow for a manual adjustment of the slit position
first guess.

2. Check of wavelength calibration. Using the 
Inverse Dispersion Solution derived by the 
pipeline, the expected position of each refer- 
ence spectrum line in each slit is computed. 
Such expected position is then compared to 
the actual arc line position as measured from 
the raw data and the difference between ex- 
pected and observed position is computed. 
For each slit, the RMS of such differences is 
also computed. The quality control is successful
when all the following conditions are satisfied:

• the median of the RMS distribution us- 
ing all slits is not larger than 0.2 pixels; 

• for each slit, the RMS is not higher 
than 0.1 pixels and lower than 0.3 pix- 
els. This condition must be satisfied at 
least by 90% of the slits. 

• in each slit, the minimum number of arc 
lines used to fit the Inverse Dispersion 
Solution is at least 9. 

• the bluest and reddest visible arc lines
are within 2.5σ from the best fit for at
least 90% of the slits.

3. Detected targets. Once data have been reduced
and monodimensional spectra extracted,
the number of detected targets is
computed. In general, given the exposure
time and the limiting magnitude of the survey,
we expect a detection rate above 90%.
If this threshold is not reached, it usually
signals either that the metal mask, in which
the slits are carved, was badly positioned
on the focal plane (an event which may occur,
see also Hammersley et al. 2010) or that
observing conditions were bad. In the latter case,
the observation quality flag (see below) also
independently indicates bad quality data.

4. Quality flag. When all exposures belonging 
to the same pointing have been reduced and 
combined together, a check on some envi- 
ronmental parameters which can affect the 
quality of the data is performed (see Garilli 
et al. 2008 for details). In particular, we 
check the mean PSF as measured from the 
reduced image, the measured sky brightness, 
and the object centering in the slit. These 
three quality parameters can score 1 (good)
or 0 (bad), and they are combined together
in order to produce a final reduction quality
flag.
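
The threshold logic behind checks of this kind can be illustrated with a short sketch. The Python below is only a minimal illustration and not the Easylife implementation: the dictionary keys, function names and data layout are hypothetical. Only the numerical limits (0.2 pixels for the median RMS, 9 arc lines per slit, 90% detection rate) are taken from the text, and only those conditions are covered.

```python
from statistics import median

# Limits quoted in the text; the key names are hypothetical stand-ins
# for whatever the real configuration file uses.
QC_LIMITS = {
    "max_median_rms_pix": 0.2,   # median of the per-slit RMS over all slits
    "min_arc_lines": 9,          # arc lines used in each slit's fit
    "min_detection_rate": 0.90,  # detected targets / targeted objects
}


class QualityCheckFailed(Exception):
    """Raised to stop the reduction and request human intervention."""


def check_wavelength_calibration(slit_rms, slit_nlines, limits=QC_LIMITS):
    """Check two of the wavelength calibration conditions.

    slit_rms    -- per-slit RMS of (expected - observed) arc line position, in pixels
    slit_nlines -- per-slit number of arc lines used in the fit
    """
    med = median(slit_rms)
    if med > limits["max_median_rms_pix"]:
        raise QualityCheckFailed(f"median RMS {med:.3f} pix exceeds limit")
    if min(slit_nlines) < limits["min_arc_lines"]:
        raise QualityCheckFailed("a slit was fitted with too few arc lines")


def check_detection_rate(n_detected, n_targets, limits=QC_LIMITS):
    """Return the detection rate, or stop if it is below the limit."""
    rate = n_detected / n_targets
    if rate < limits["min_detection_rate"]:
        raise QualityCheckFailed(f"detection rate {rate:.0%} below limit")
    return rate
```

Raising an exception mirrors the behaviour described above, where a failed check stops the procedure until a person intervenes; keeping the limits in one dictionary mirrors the use of a configuration file.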

7.3. Redshift measurement 

Once the data are fully reduced, they are ingested
into a blind redshift measurement pipeline
provided by EZ, fully described in Garilli et al.
(2010). EZ has been developed within the VVDS
project to help in redshift measurement from optical
spectra. The basic idea is to allow the user
to combine the available functions in the most
appropriate way for the data at hand, thus building
new user-defined functions and methods. At the
topmost level, a redshift measurement decision tree
can be built, which mimics the decision path followed
by an astronomer to get to the measurement of
the redshift. Complete automation of the redshift
measurement process can be tricky when spectra
are noisy (as they always are at the faint limit of
a survey) or in the presence of artifacts such as
fringing correction residuals, so that it is by no means
guaranteed, a priori, that the best solution proposed
is also a correct solution. For this reason,
EZ also computes a reliability flag which summarizes
the goodness of the proposed solution. As for
the redshift, the reliability flag computation is also
performed mimicking the kind of logical reasoning
applied by an astronomer when trying to evaluate
whether a redshift is reliable or not. The implemented
flagging system is rather conservative, as demonstrated
in Garilli et al. (2010). EZ can be used
either interactively or totally blindly in unsupervised
mode, which is the mode we have adopted

The redshifts thus obtained are compared for
consistency with the photometric redshifts, and an
appropriate decimal flag is added to the reliability
flag provided by EZ. This particular final step of
the reduction is applicable in the case of VIPERS,
but might not be applicable in other cases. The
modularity of Easylife allows one to switch on or off
any reduction step, according to need. The
final approval of the redshifts is formalized after a
human check. The reduced data are submitted to
the survey team members in charge of
redshift validation, who have at their disposal
the mono- and two-dimensional object spectra, together
with their associated sky and noise spectra,
the output of the automatic measurement with its
associated flag, and the information on whether such
a measurement agrees or not with the photometric
redshift (within the photometric redshift error).
In Section 8 we show how the automatic
redshift measurement performs on the VIPERS
data.
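
As an illustration of the consistency check against the photometric redshift, the following Python sketch adds a decimal part to an integer reliability flag. It is a sketch under stated assumptions: the function name is hypothetical, and the decimal values (0.5 for agreement, 0.1 for disagreement) are placeholders chosen for the example, since the actual values are not given here. The only element taken from the text is the criterion itself, namely agreement within the photometric redshift error.

```python
def add_photoz_decimal(ez_flag, z_spec, z_phot, z_phot_err,
                       agree_decimal=0.5, disagree_decimal=0.1):
    """Append a decimal to the EZ reliability flag.

    The decimal values are illustrative placeholders, not the survey's
    convention. Agreement means |z_spec - z_phot| <= z_phot_err.
    """
    agrees = abs(z_spec - z_phot) <= z_phot_err
    decimal = agree_decimal if agrees else disagree_decimal
    return ez_flag + decimal, agrees


# Example: a flag 3 redshift that agrees with its photometric redshift.
flag, agrees = add_photoz_decimal(3, z_spec=0.70, z_phot=0.72, z_phot_err=0.05)
```

Keeping the integer part untouched means the EZ flag can always be recovered by truncation, while the decimal part carries the independent photometric information.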

7.4. Survey Database 

Once redshifts have been validated by a human
and uploaded to the survey web site, they automatically
enter the spectroscopic database, together
with the other scientifically interesting
quantities such as the object magnitude in the
selection band and its coordinates. The database
also hosts the parent photometric catalog, containing
ugriz magnitudes from the CFHTLS survey
and photometric redshifts. Periodically (typically
on a yearly basis) the spectroscopic catalog
is frozen in a release, which is made available to
the whole team for scientific analysis and checks.

Easylife allows one to access the photometric
parent catalogs and the spectroscopic catalogs
through an embedded DART Web interface
installation. As described in (2008),
DART gives per-user access to the data, allowing
users to query catalogs, filter data by placing
conditions on the column values (even complex
expressions), view the results and export them to
private user files stored on the remote data server.
DART also allows users to make simple plots or retrieve
the data products related to the catalogs, such as the
mono-dimensional spectra resulting from the reduction
process or any other ancillary data product
(image thumbnails in different bands, links to
external web sites, documents, etc.). The software
supports access to more than one catalog at a time
(e.g. for multi-band usage):

• in parallel, namely querying each catalog
separately at the same time;

• as a pair linked by a pre-built correlation
table released by the data managers;

• as a single virtual table, which shows
the result of the pure correlation by object
ID among several catalogs.
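
The third access mode, a single virtual table built by correlating catalogs on the object ID, amounts to an inner join. The Python sketch below illustrates the idea on in-memory rows; it is not DART code, and the function name and the column names (`id`, `i_mag`, `z_spec`, `flag`) are hypothetical.

```python
def correlate_by_id(*catalogs, key="id"):
    """Inner-join several catalogs (lists of dict rows) on a shared object ID.

    Only objects present in every catalog appear in the result; columns
    from later catalogs are merged into the row of the first one.
    """
    indexed = [{row[key]: row for row in cat} for cat in catalogs]
    common = set(indexed[0]).intersection(*indexed[1:])
    merged = []
    for obj_id in sorted(common):
        row = {}
        for index in indexed:
            row.update(index[obj_id])
        merged.append(row)
    return merged


# Hypothetical rows standing in for the photometric and spectroscopic catalogs.
photometric = [{"id": 1, "i_mag": 21.3}, {"id": 2, "i_mag": 22.0}]
spectroscopic = [{"id": 2, "z_spec": 0.65, "flag": 3}]
virtual_table = correlate_by_id(photometric, spectroscopic)
```

In this example only object 2 survives the correlation, because object 1 has no spectroscopic entry.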

DART also supports the IVOA SSA protocol for
spectra access, the IVOA SIA protocol for image
access and the ConeSearch protocol for catalog
access (http://www.ivoa.net/Documents/), thus
opening a gate towards the Virtual Observatory
facilities for VIPERS data. DART allows
different access privileges to be given to different user
classes, so that at the same time one can have a
public part, a team-reserved part containing the
most recent release, and a restricted part not yet
released to the team, containing the data
accumulated after the last team release.
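
As an illustration of catalog access through the ConeSearch protocol, the Python sketch below builds a query URL. The IVOA Simple Cone Search standard defines the three query parameters RA, DEC and SR (search radius), all in decimal degrees. The service address used here is a placeholder and not the address of the VIPERS service, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL.

    RA, DEC and SR are the standard parameter names, all expressed in
    decimal degrees. No request is sent; only the URL is assembled.
    """
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Placeholder service address, for illustration only.
url = cone_search_url("http://example.org/scs", 30.5, -4.5, 0.1)
```

A Virtual Observatory client would issue an HTTP GET on such a URL and receive the matching catalog rows as a VOTable document.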

8. Easylife performance

The Easylife reduction blocks detailed above, 
coupled with the automatic redshift measurement, 
are very efficient: the full chain, including all 
the automatic quality checks, requires about 40 
minutes of computation time per pointing (each 
VIPERS pointing containing of the order of 320 
spectra), without supervision or human interven- 
tion. 

8.1. Data reduction performance 

Human intervention is required when one of the
quality checks described in Section 7.2.3 fails and
the procedure is stopped. In Table 1 we give the
failure rate of the automatic reduction procedure
that we have experienced in the first 4 × 113 = 452
quadrants of the VIPERS survey. For 92% of the
observations, the automatic reduction ran smoothly
without human intervention and the data satisfied
all quality checks. In only 2.5% of the data (i.e. 11
quadrants) did the automatic procedure fail
either to locate spectra (9 quadrants)
or to derive a good wavelength calibration solution
(within the limits set in the quality control
configuration file).

The check which fails most often (5.5% of the time,
i.e. 24 quadrants) is the one on the number of detected
sources. Cross-correlating the quality
parameter with these quadrants shows that in
10 out of 24 cases below-average observing conditions
are responsible for the lower than average
detection rate, while other observational hardware
problems (e.g. guide lost during observation, field
partially obscured by the guide probe, bad mask
insertion) account for the low detection rate of 10
other quadrants. In only 4 cases (less than 1%)
the low detection rate seems to be due to local
problems in the photometric catalogue, affected
by the presence of a bright star, or by a poor astrometric
solution when preparing masks, which
may affect the corners of the field. Overall, our
quality control proves to be reliable and allows us 
to quickly spot data below average quality. This 
information is not only useful per se but also to 
assign pointings for redshift measurement check: 
while higher quality data can be checked by one 
person only, the lower quality ones are systemati- 
cally looked at by two different people. 
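
The quadrant counts quoted in this section can be cross-checked against the percentages with a few lines of Python. This is a worked arithmetic check only. The count of 2 wavelength calibration failures is an inference (11 combined failures minus 9 location failures, assuming the two groups do not overlap), and the variable names are chosen for the example.

```python
N_QUADRANTS = 4 * 113  # 452 quadrants in total

# Quadrant counts quoted in the text; the wavelength calibration count
# is inferred as 11 - 9, assuming the two failure types do not overlap.
failures = {
    "spectra location": 9,
    "wavelength calibration": 11 - 9,
    "target detection": 24,
}


def failure_rates(counts, total):
    """Return the failure rate in percent for each failure reason."""
    return {reason: 100.0 * n / total for reason, n in counts.items()}


rates = failure_rates(failures, N_QUADRANTS)
total_rate = 100.0 * sum(failures.values()) / N_QUADRANTS

for reason, pct in rates.items():
    print(f"{reason:24s} {pct:4.1f}%")
print(f"{'total':24s} {total_rate:4.1f}%")
```

The computed rates agree with the quoted 2%, 0.5%, 5.5% and 8% to within a few tenths of a percent, which suggests that the quoted values are rounded.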



8.2. Automatic redshift measurement performance

All VIPERS redshifts have been manually validated,
as had been done for the VVDS and the
zCosmos surveys (Le Fèvre et al. 2005; Lilly et al.
2007). In Garilli et al. (2010) it has been
shown that EZ, used in blind mode, had a measurement
success rate of 95% on simulated data,
while on the VVDS and zCosmos surveys the success
rate was ∼70% on the whole sample, rising
to 90% for redshifts classified as very secure by
astronomers. In Table 2 we summarize the results
obtained on the first ∼36000 detected targets belonging
to the 113 VIPERS pointings mentioned
above. The redshift flag scheme implemented in
EZ mimics the one adopted for the VVDS and
the zCosmos surveys, i.e.

• flag 4: a 100% secure redshift, with high 
SNR spectrum and obvious spectral features 
supporting the redshift measurement; 

• flag 3: a 90% secure redshift, strong spectral 
features; 

• flag 2: a 75% secure redshift measurement, 
several features in support of the measure- 
ment; 

• flag 9: only one secure single spectral feature
in emission, typically interpreted as
[OII] 3727 or Hα;

• flag 1: a 50% reliable redshift measurement, 
based on weak spectral features and contin- 
uum shape; 

• flag 0: no reliable redshift measurement pos- 
sible; 

In the table, results are subdivided by automatic
reliability flag. For each automatic flag (column 1),
the table shows the number of spectra for
which EZ has measured a redshift assigning that
particular reliability flag (column 2), the number
of spectra for which the redshift has been confirmed
by the astronomers (column 3), and the resulting
success rate (column 4). The results shown in Table 2
are in line with those already obtained for
the VVDS Wide survey: overall, the automatic
measurement has been confirmed for 76% of the
spectra, with confirmation rising to 94% for automatic
flags 3 and 4.
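
The success rates in Table 2 follow directly from its two count columns, as the short Python check below shows. The counts are copied from the table. The label of the last row is lost in the extracted table and is taken here to be flag 0, an assumption based on the 52% figure matching that row. The variable names are chosen for the example.

```python
# (total spectra, confirmed redshifts) per automatic EZ flag, from Table 2.
# The "0" label for the last row is an assumption: it is missing in the table.
TABLE2 = {
    "3-4": (20043, 18889),
    "2": (2213, 1677),
    "9": (1548, 1188),
    "1": (2790, 1970),
    "0": (6941, 3598),
}
ALL_FLAGS = (35903, 27322)  # the "any" row of the table


def success_rate(total, confirmed):
    """Percentage of automatic redshifts confirmed by the astronomers."""
    return 100.0 * confirmed / total


rates = {flag: success_rate(*counts) for flag, counts in TABLE2.items()}
overall = success_rate(*ALL_FLAGS)

# The confirmed counts of the individual rows add up to the "any" row.
confirmed_sum = sum(confirmed for _, confirmed in TABLE2.values())
```

Rounded to the nearest percent, the computed values reproduce the success rate column of the table.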

Table 2 also shows that the automatic flag is
more restrictive than the human one, as already
stated in Garilli et al. (2010): 52% of the redshifts
flagged as 0 (unreliable) by EZ have been confirmed
by astronomers. Table 3 compares the automatically
assigned flags with the human ones,
when the automatic redshift has been confirmed.
We can see that flags 3 and 4 have been confirmed
81% of the time, flags 2-9 55% and flag 1 19%
of the time, while automatic flag 0 became flags
3-4 in 27% of the cases, again supporting the
conservativeness of the automatic flag assignment.

9. Using Easylife for LBT data 

The modular approach of Easylife has allowed
us to easily adapt it to other, totally different
projects. Currently, it is used within the framework
of the Italian LBT (Large Binocular Telescope,
Hill & Salinari 1998) Data Center to reduce
all spectroscopic observations obtained with either
MODS (Multi-Object Double Spectrographs,
Pogge et al. 2010) or LUCI (LBT NIR spectroscopic
Utility with Camera and Integral-field unit,
Mandel et al. 2000) during the Italian observing
time. Since MODS is a multi-object slit-based
spectrograph operating in the visible range, similar to
VIMOS in its concept, adaptation of the reduction
part has been straightforward, the required
intervention being limited to the development of
the instrument-dedicated part of the Unpacker.
LUCI is a multi-object spectrograph working in the
NIR J, H and K bands. Therefore, on top of a
dedicated Unpacker, some more work on the reduction
recipes has been performed, to comply
with the peculiarities of NIR spectroscopic
data (e.g. the much more delicate problem
of sky subtraction). The main difference
between the reduction center for a large-scale survey,
like VIPERS, and the reduction center for a
whole community, like the LBT Italian Data Center,
however, resides in the different services the two
centers must provide. In the first case, data are
acquired with the same instrument configuration,
which makes reduction easier, but a number of
other tasks are required (logging, database, etc.).
In the second case, data are acquired with a variety
of instrument configurations, satisfying a variety
of scientific needs, and the reduction chain must
be able to cope with such diversity. On the other
hand, the management part, as well as data product
distribution, is minimal: the only two actions
required are to keep track of which data have been
reduced and what remains to be done, on one side,
and to make the reduced data available to the PIs on
the other. In spite of these fundamental differences,
Easylife can handle both cases: in the LBT
application, the Web part has been suppressed,
and the management database is structured in a
different way. The reduction chain is more versatile,
with several branches according to instrument
mode, while the redshift measurement part is
suppressed. Adaptation of Easylife from the VIPERS
survey case to the LBT service data center case has
required only a few months of work (mostly devoted
to the implementation of the NIR-dedicated reduction
recipes), thanks to the modular approach
followed since the beginning, as well as to the careful
design of the basic architecture.
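
The way a modular chain can serve both use cases may be sketched as follows. This Python example is a minimal illustration of switching steps on or off through configuration; it does not reproduce the Easylife or EASE interfaces. The class, the step names and the pass-through placeholder functions are all hypothetical. Only the choice of which parts are suppressed in the LBT case (redshift measurement and the Web part) comes from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Pipeline:
    """Ordered reduction steps, each of which can be switched on or off."""
    steps: List[Tuple[str, Callable[[dict], dict]]]
    enabled: Dict[str, bool] = field(default_factory=dict)  # missing name = on

    def run(self, data: dict) -> Tuple[dict, List[str]]:
        executed = []
        for name, func in self.steps:
            if self.enabled.get(name, True):
                data = func(data)
                executed.append(name)
        return data, executed


def passthrough(data: dict) -> dict:
    """Placeholder standing in for a real reduction block."""
    return data


# Hypothetical step names, chosen only to mirror the blocks named in the text.
STEPS = [
    ("unpack", passthrough),
    ("reduce", passthrough),
    ("quality_checks", passthrough),
    ("measure_redshift", passthrough),
    ("publish_web", passthrough),
]

# Survey case: every step is on. Data center case: two steps are suppressed.
vipers = Pipeline(STEPS)
lbt = Pipeline(STEPS, enabled={"measure_redshift": False, "publish_web": False})

_, vipers_steps = vipers.run({})
_, lbt_steps = lbt.run({})
```

With this layout, adapting the chain to a new project changes the configuration and the instrument-specific steps, while the execution loop stays the same.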

10. Summary 

Easylife is the automatic data reduction and
management system set up for the VIPERS survey.
Easylife allows large amounts of data to be
reduced automatically in a timely way and performs
reliable quality controls on the data quality (namely
the observing conditions) and on data reduction.
The reduction chain ends with automated redshift
measurements.

The automatic quality controls inserted in the
pipeline have shown that reduction is successful in
> 95% of the cases, when observing conditions are
within specifications. Observations not satisfying
the requested observing constraints are automatically
spotted and account for the vast majority
of automatic reduction failures. Easylife also
comprises project support tools, a survey advancement
logging system, and data access through a
dedicated database. The underlying EASE software
environment allows a smooth interaction
between the database, the core of the reduction
system and the publicly exposed web interface,
as well as distributed computing on a Beowulf
cluster.

Presently, Easylife is also used in the framework
of the LBT spectroscopic data reduction center,
providing PIs with fully reduced and calibrated
spectra, see e.g. Magrini et al. (2012).

Table 1

Easylife performance on data reduction

Failure reason            Failure rate
Spectra location          2%
Wavelength calibration    0.5%
Target detection          5.5%
Total                     8%

Table 2

Automatic redshift measurement performance

EZ flag   Total spectra   Correct redshifts   Success rate
any       35903           27322               76%
3-4       20043           18889               94%
2         2213            1677                76%
9         1548            1188                77%
1         2790            1970                71%
0         6941            3598                52%



Table 3

Automatic and humanly assigned flags comparison

          human flag
EZ flag   3-4     2-9     1
3-4       81%     16%     3%
2-9       32%     55%     13%
1         40%     41%     19%
0         27%     48%     25%

[Figure 2: block diagram. Blocks: Observations; Science Frames; Lamp Frame; Instrument Calibrations; Sky Subtraction; Spectra 2D Extraction; Sequence Combination; Fringing Correction; Spectra 1D Extraction.]

Fig. 2. — Block diagram summarizing the main 
steps involved in the reduction of VIMOS data 
(Scodeggio et al. 2005)



[Figure 4: web panel for one pointing. Recoverable content: per-quadrant (Q1 to Q4) log entries for preimaging, mask preparation, spectroscopic OB submission, observation, ingestion, reduction, assignment to validators, check-out and completion, each with its date; a SEQUENCES box listing observation date, airmass range, seeing range, moon status, per-quadrant quality flags (e.g. 111, 110), and per-quadrant counts of redshift failures and undetected spectra.]


Fig. 4. — Example of panel showing the observa- 
tion and reduction details for one pointing 




Fig. 1. — FASE architecture as implemented for the Easylife system. On the top part, we find the application 
and presentation layer. The bottom part shows the three main containers (Reducer, Unpacker and Organizer) 
with their respective components. Everything is linked together by the execution framework (EF) provided 
by an early prototype of the FASE environment.

Fig. 3. — Example of panel showing the status of the observations. The graph displays the pointings, placing
them at the correct coordinates and coloring each pointing with a different color depending on its current
status. The status ranges from pre-imaging submission up to data validation assignment, with a final status
assigned when the processing of the pointing has been definitively closed.